Hamshahri Corpus: English Wikipedia
Hamshahri Corpus

The Hamshahri Corpus is a sizable Persian corpus based on the Iranian newspaper ''Hamshahri'', one of the first online Persian newspapers in Iran. It was initially collected and compiled by Ehsan Darrudi at the Database Research Group (DBRG) of the University of Tehran 〔DBRG News, Database Research Group〕. Later, a team headed by Ale Ahmad 〔Hamshahri, Database Research Group〕 built on this corpus and created the first Persian text collection suitable for information retrieval evaluation tasks.
The corpus was created by crawling online news articles from the ''Hamshahri'' website and processing the HTML pages into a standard text corpus for modern information retrieval experiments.
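The HTML-processing step can be illustrated with a minimal sketch. It assumes nothing about the actual pipeline used by the corpus authors, which the source does not describe. It only shows one common way to reduce a crawled page to plain text with Python's standard `html.parser` module. The class name `ArticleTextExtractor`, the helper `html_to_text`, and the sample page are hypothetical.

```python
from html.parser import HTMLParser


class ArticleTextExtractor(HTMLParser):
    """Collects visible text from an HTML page, skipping script and style blocks.

    Hypothetical helper for illustration; not the corpus authors' code.
    """

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0  # > 0 while inside a skipped element
        self._chunks = []     # text fragments in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style elements.
        if self._skip_depth == 0:
            text = data.strip()
            if text:
                self._chunks.append(text)

    def text(self):
        # Join fragments and collapse runs of whitespace into single spaces.
        return " ".join(" ".join(self._chunks).split())


def html_to_text(html):
    """Return the visible text of an HTML string as one normalized line."""
    parser = ArticleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


# Made-up sample page standing in for a crawled news article.
sample_page = (
    "<html><head><title>Sample</title><style>p {color: red}</style></head>"
    "<body><h1>City news</h1><p>First &amp; second paragraph.</p>"
    "<script>var x = 1;</script></body></html>"
)
print(html_to_text(sample_page))
```

A real crawl would also need to isolate the article body from navigation and advertising markup and to handle the site's character encoding. Both depend on the page layout and are left out here.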
== Version 1.0 ==
The collection contains more than 160,000 articles covering the following subject categories: politics, city news, economics, reports, editorials, literature, sciences, society, foreign news, sports, etc. Document size varies from short news items (under 1 KB) to rather long articles (e.g. 140 KB), with an average of 1.8 KB.
The corpus is available in several formats for download:
* Tagged Text: 560 MB
* SQL Server 2000 tables: 712 MB

Excerpt source: Wikipedia, the free encyclopedia